Topological reversibility and causality in feed-forward networks 
Systems whose organization displays causal asymmetry constraints, from evolutionary trees to
river basins or transport networks, can often be described in terms of directed paths (causal flows)
on a discrete state space. Such a set of paths defines a feed-forward, acyclic network. A key problem
associated with these systems involves characterizing their intrinsic degree of path reversibility:
given an end node in the graph, what is the uncertainty of recovering the process backwards until
the origin? Here we propose a novel concept, topological reversibility, which rigorously weighs
such uncertainty in path dependency, quantified as the minimum amount of information required to
successfully revert a causal path. Within the proposed framework we also analytically characterize
limit cases for both topologically reversible and maximally entropic structures. The relevance of
these measures within the context of evolutionary dynamics is highlighted.
I. INTRODUCTION

Causality is the fundamental principle pervading dynamical
processes. Any set of time-correlated events, from the
development of an organism to historical changes, defines a
feed-forward structure of causal relations captured by a
family of complex networks called directed acyclic graphs
(DAGs). Their structure has recently attracted the interest
of researchers [1]-[4], since DAGs represent time-ordered
processes as well as a broad number of natural and
artificial systems. Examples include simple electronic
circuits [5], feed-forward neural [6] and transmission
networks [7], river basins [8], and even some food webs and
chemical structures [9].

A paradigmatic example of a causal structure is the chart
of the relations among states followed by a computational
process through time. Intimately linked to the topology of
the computational chart of consecutive states, a fundamental
feature of a computation is its degree of logical
reversibility [10], [11]. Indeed, a process is said to be
logically reversible when, reverting the flow of causality,
i.e. going backwards from the computational outputs to their
inputs, we can unambiguously recover the causal structure of
the process. Roughly speaking, if we have a computer
performing a function $g : \mathbb{N} \to \mathbb{N}$ and we
can unambiguously determine the input $u$ from the sole
knowledge of the value $v = g(u)$, we say that the function
is logically reversible. Otherwise, if there is uncertainty
in determining $u$ from the sole knowledge of $v$, we say
that the function is logically irreversible, and thus
additional information is needed to successfully reconstruct
a given computational path.

Analogously, the potential scenarios emerging from an
evolutionary process raise similar questions. Within
evolutionary biology, a relevant problem is how predictable
evolutionary dynamics is. In particular, it has been asked
what would be the result of going backwards and "re-playing
the tape of evolution" [12], [15]. Since this question
pervades the problem of how uncertain or predictable a given
evolutionary path is, it seems desirable to provide a
foundational framework.

In this paper, we analytically extend the concept of logical
reversibility to the study of any causal structure having no
cyclic topologies, thereby defining a broader concept to be
named topological reversibility. Whereas thermodynamical
irreversibility implies thermodynamical entropy production
[14], [15], topological irreversibility implies statistical
entropy production. In general, we will say that a DAG is
topologically reversible if we can unambiguously recover a
path going backwards from any element to the origin.
Genealogies and phylogenies are examples of tree-like
structures where a chronological order can be established
among the events and an unambiguous reconstruction of the
lineage can be performed for every element of the graph
[16]. Following this argument, we will label a graph as
topologically irreversible when some uncertainty is observed
in the reconstruction of trajectories.

As shown below, the entropy presented here weighs the extra
amount of information that would be required to recover the
causal flow backwards. Information measures are not new in
the study of complex networks [17]-[23], although such
measures accounted for connectivity correlations [18], [19],
[21], [22] or were used to characterize a Gibbsian
formulation of the statistical mechanics of complex networks
[17]. We finally note that the starting point of our
formalism resembles the classical theory of Bayesian
networks. However, the particular treatment of reversibility
proposed here is qualitatively different from the concept of
uncertainty used in such a framework and closer to the one
described in [20].

The paper is organized as follows: In section II we provide
the basic concepts underlying our analytical derivations.
Section III provides the general mathematical definition of
topological reversibility and the general expression for the
average uncertainty associated to the reversion of the
causal flow. This is consistently derived from the
properties of the adjacency matrix. In section IV we
consider two limit cases, finding the exact analytic form
for their entropies and predicting the most uncertain
configuration. Finally, in section V we outline the
generality and relevance of our results in terms of
characterizing DAG structure.



II. THEORETICAL BACKGROUND 

The theoretical roots of this paper stem from fundamental
notions of directed graph theory [24], [25], ordered set
theory [26], [27] and information theory [28]-[31].
Specifically, we make use of Shannon's entropy which, as
originally defined, quantifies the uncertainty associated to
certain collections of random events [28], [30]. In our
framework, the entropy in a given feed-forward graph
measures the uncertainty in reversing the causal flow
depicted by the arrows [39].



A. Directed graphs and orderings 

Let $\mathcal{G}(V,E)$ be a directed graph, where
$V = \{v_1, ..., v_n\}$, $|V| = n$, is the set of nodes and
$E = \{(v_k,v_i), ..., (v_j,v_l)\}$ is the set of edges; the
order $(v_k,v_i)$ implies that there is an arrow in the
direction $v_k \to v_i$. Given a node $v_i \in V$, the number
of outgoing links, written $k_{out}(v_i)$, is called the
out-degree of $v_i$, and the number of ingoing links of
$v_i$ is called the in-degree of $v_i$, written
$k_{in}(v_i)$. The adjacency matrix of a given graph
$\mathcal{G}$, $A(\mathcal{G})$, is defined as
$A_{ij}(\mathcal{G}) = 1 \Leftrightarrow (v_i,v_j) \in E$, and
$A_{ij}(\mathcal{G}) = 0$ otherwise. Through the adjacency
matrix, $k_{in}$ and $k_{out}$ are computed as

$$k_{in}(v_i) = \sum_{j \le n} A_{ji}(\mathcal{G}); \qquad k_{out}(v_i) = \sum_{j \le n} A_{ij}(\mathcal{G}). \qquad (1)$$

Furthermore, we will use the known relation between the
$k$-th power of the adjacency matrix and the number of paths
of length $k$ going from a given node $v_i$ to a given node
$v_j$. Specifically,

$$(A(\mathcal{G}))^k_{ij} = (\overbrace{A(\mathcal{G}) \times \cdots \times A(\mathcal{G})}^{k\ \text{times}})_{ij}$$

is the number of paths of length $k$ going from node $v_i$ to
node $v_j$ [25].
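The relation between matrix powers and path counts can be checked on a small example. The following is a minimal sketch in plain Python; the four-node graph `ADJ` and the helper names `mat_mul` and `count_paths` are illustrative assumptions, not part of the original text.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_paths(adj, length):
    """Return adj**length; entry [i][j] counts directed paths of that length from i to j."""
    n = len(adj)
    # The identity matrix corresponds to paths of length 0.
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        result = mat_mul(result, adj)
    return result


# Hypothetical DAG with edges 0->1, 0->2, 1->2, 1->3, 2->3.
ADJ = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

# In-degree and out-degree as column and row sums, as in eq. (1).
K_IN = [sum(ADJ[j][i] for j in range(4)) for i in range(4)]
K_OUT = [sum(ADJ[i][j] for j in range(4)) for i in range(4)]
```

In this example node 0 reaches node 3 through two paths of length 2 and one path of length 3, and every power beyond the third vanishes because the graph is acyclic.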

A feed-forward or directed acyclic graph is a directed graph
characterized by the absence of cycles: If there is a
directed path from $v_i$ to $v_k$ (i.e., there is a finite
sequence $(v_i,v_j),(v_j,v_l),(v_l,v_s),...,(v_m,v_k) \in E$)
then there is no directed path from $v_k$ to $v_i$.
Conversely, the matrix $A^T(\mathcal{G})$ depicts a DAG with
the same underlying structure but having all the arrows (and
thus, the causal flow) inverted. Given its acyclic nature,
one can find a finite value $L(\mathcal{G})$ as follows:

$$L(\mathcal{G}) = \max\{k : (\exists v_i, v_j \in V : (A(\mathcal{G}))^k_{ij} \neq 0)\}. \qquad (2)$$

It is easy to see that $L(\mathcal{G})$ is the length of the
longest path of the graph. The existence of such
$L(\mathcal{G})$ can be seen as a test for acyclicity.
However, the use of leaf-removal algorithms [32], [33], i.e.
the iterative pruning of nodes without outgoing links, is by
far more suitable than the above method in terms of
computational cost. In a DAG, a leaf-removal algorithm
removes the graph completely in a finite number of
iterations, specifically, in $L(\mathcal{G})$ iterations
(see eq. (2)).
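A leaf-removal test of acyclicity can be sketched as follows. This is an illustrative Python version, not the algorithm of refs. [32], [33]; the function name and the example matrices are assumptions. The sketch also counts the final round that removes the last, by then isolated, nodes, so it subtracts one to recover the longest path length measured in links.

```python
def longest_path_by_leaf_removal(adj):
    """Prune nodes without outgoing links until the graph is empty.

    Returns the longest path length (in links) if the graph is acyclic,
    or None if pruning gets stuck, which signals a cycle.
    """
    alive = set(range(len(adj)))
    rounds = 0
    while alive:
        # A leaf has no outgoing link towards a node that is still alive.
        leaves = {i for i in alive if not any(adj[i][j] for j in alive)}
        if not leaves:
            return None  # every remaining node points to another one: cycle
        alive -= leaves
        rounds += 1
    # Convention assumed here: the last round removes the path origins.
    return rounds - 1


# Hypothetical DAG with edges 0->1, 0->2, 1->2, 1->3, 2->3.
DAG = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
# Two nodes pointing at each other form a cycle.
CYCLE = [
    [0, 1],
    [1, 0],
]
```

For `DAG` the longest path is 0, 1, 2, 3, i.e. three links, and the cyclic example is rejected.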

Now we study the interplay between DAGs and order relations.
Borrowing concepts from order theory [27], we define the
following set:

$$M = \{v_i \in V : k_{in}(v_i) = 0\}, \qquad (3)$$

to be named the set of maximal nodes of $\mathcal{G}$, for
which $|M| = m$. The set of all paths $\pi_1, ..., \pi_s$,
$s \ge |E|$, from $M$ to any node $v_i \in V \setminus M$ is
indicated as $\Pi(\mathcal{G})$. Given a node
$v_i \in V \setminus M$, the set of all paths from $M$ to
$v_i$ is written as $\Pi(v_i) \subseteq \Pi(\mathcal{G})$.
Furthermore, we will define the set $v(\pi_k)$ as the set of
all nodes participating in the path $\pi_k$, except the
maximal one. Additionally, one can define the set of nodes
with $k_{out} = 0$ as the set of minimal nodes of
$\mathcal{G}$, to be named $\mu$. Notice that the absence of
cycles implies that $m \ge 1$ and that the set of minimals
$\mu$ must also contain at least one element (see fig. 1a).

Attending to the node relations depicted by the arrows, and
due to the acyclic property, at least one node ordering can
be defined, establishing a natural link between order theory
and DAGs. This order is achieved by labeling all the nodes
with sequential natural numbers and obtaining a
configuration such that:

$$((v_i, v_j) \in E) \Rightarrow (i < j). \qquad (4)$$



Accordingly, DAGs are ordered graphs [2 . However, as 
order relations imply transitivity, it is not the DAG but
its transitive closure that properly defines the order
relation among the elements of $V$. The transitive closure
of $\mathcal{G}$ (see fig. 1b), to be written as
$T(\mathcal{G}) = (V_T, E_T)$, is defined as follows: Any
pair of nodes $v_i, v_k \in V$ for which there is at least
one path going from $v_i$ to $v_k$ are connected through a
link $(v_i,v_k)$ in $T(\mathcal{G})$. In this framework, for
a given number of maximal nodes, in the transitive closure
the addition of a link either creates a cycle or destroys a
maximal or minimal node. If the pairs defining the set of
links of $T(\mathcal{G})$ are conceived as the elements of a
set relation $E_T \subseteq V \times V$, such a relation
satisfies the following three properties:

i) $\nexists (v_k, v_k) \in E_T$,

ii) $((v_i,v_k) \in E_T) \Rightarrow ((v_k,v_i) \notin E_T)$,

iii) $((v_i,v_k) \in E_T \wedge (v_k,v_j) \in E_T) \Rightarrow ((v_i,v_j) \in E_T)$.








FIG. 1: Some illustrative DAGs. A topologically irreversible DAG $\mathcal{G}(V,E)$, where $M$ denotes the set of maximals, $\mu$ the set
of minimals and $V \setminus M$ the set of non-maximals (a). The respective transitive closure, $T(\mathcal{G})$, is shown in (b). A linear
ordering of the set $V \setminus M$ of $\mathcal{G}(V,E)$ is displayed in (c), where any node of the maximal set is connected to any node of the set
$V \setminus M$. This is a special structure displaying maximal entropy (see text).



The DAG definition implies that $E$ directly satisfies the
first two conditions, whilst the third one (transitivity) is
only warranted for $E_T$. Thus, only $E_T$ meets all the
requirements to be an order relation, specifically, a strict
partial order. The transitive closure of a given DAG can be
obtained by means of the so-called Warshall's algorithm
[25].
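The three properties of a strict partial order can be verified numerically on the transitive closure. The sketch below is an illustrative Python rendering of the standard Warshall triple loop; the function names and the three-node chain are assumptions introduced for the example.

```python
def transitive_closure(adj):
    """Warshall-style closure: reach[i][j] = 1 if some directed path leads from i to j."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):            # allow k as an intermediate node
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach


def is_strict_partial_order(rel):
    """Check irreflexivity, asymmetry and transitivity of a 0/1 relation matrix."""
    n = len(rel)
    irreflexive = all(rel[i][i] == 0 for i in range(n))
    asymmetric = all(not (rel[i][j] and rel[j][i])
                     for i in range(n) for j in range(n))
    transitive = all(rel[i][j]
                     for i in range(n) for k in range(n) for j in range(n)
                     if rel[i][k] and rel[k][j])
    return irreflexive and asymmetric and transitive


# Hypothetical chain 0 -> 1 -> 2: the edge set itself is not transitive.
CHAIN = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
CLOSURE = transitive_closure(CHAIN)
```

Here the closure adds the link from node 0 to node 2, after which all three properties hold, whereas the original edge set fails only the transitivity check.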

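Finally, a subgraph $\mathcal{F}(V_{\mathcal{F}}, E_{\mathcal{F}}) \subseteq \mathcal{G}$
is said to be linearly ordered or totally ordered provided
that for all pairs of nodes $v_k, v_i \in V_{\mathcal{F}}$
such that $k < i$,

$$(v_k, v_i) \in E_{\mathcal{F}}. \qquad (5)$$

Let us notice that if we understand $E_{\mathcal{F}}$ as a
set relation $E_{\mathcal{F}} \subseteq V_{\mathcal{F}} \times V_{\mathcal{F}}$,
$E_{\mathcal{F}}$ is a strict linear order. If $\mathcal{G}$
is linearly ordered and $\mathcal{W} \subseteq \mathcal{G}$,
we refer to $\mathcal{G}$ as a topological sort of
$\mathcal{W}$ [25].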



B. Uncertainty 

According to classical information theory [28]-[31], let us
consider a system $S$ with $n$ possible states, whose
occurrences are governed by a random variable $X$ with an
associated probability mass function formed by
$p_1, ..., p_n$. According to the standard formalization,
the uncertainty or entropy associated to $X$, to be written
as $H(X)$, is:

$$H(X) = -\sum_{i \le n} p_i \log p_i, \qquad (6)$$

which is actually an average of $\log(1/p(X))$ among all
events of $S$, namely, $H(X) = \langle \log(1/p(X)) \rangle$,
where $\langle ... \rangle$ is the expectation or average of
the random quantity between brackets. Since the logarithm is
a concave function, the entropy satisfies the so-called
Jensen's inequality [29], which reads:

$$\left\langle \log \frac{1}{p(X)} \right\rangle \le \log \left\langle \frac{1}{p(X)} \right\rangle \le \log n. \qquad (7)$$



The maximum value $\log n$ is achieved for $p_i = 1/n$ for
all $i = 1, ..., n$. Jensen's inequality provides an upper
bound on the entropy that will be used below. Analogously,
we can define the conditional entropy. Given another system
$S'$ containing $n'$ values or choices, whose behavior is
governed by a random variable $Y$, let $P(s'_i|s_j)$ be the
conditional probability of obtaining $Y = s'_i \in S'$ if we
already know $X = s_j \in S$. Then, the conditional entropy
of $Y$ from $X$, to be written as $H(Y|X)$, is defined as:

$$H(Y|X) = -\sum_{j \le n} p_j \sum_{i \le n'} P(s'_i|s_j) \log P(s'_i|s_j), \qquad (8)$$

which is typically interpreted as a noise term in
information theory. Such a noise term can be interpreted as
the minimum number of extra bits needed to unambiguously
determine the input set from the sole knowledge of the
output set. This will be the key quantity of our paper, for
it accounts for the dissipation of information in a given
process.
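As a quick numerical illustration of eqs. (6) and (8), the following Python sketch evaluates both entropies for small hand-made distributions. Natural logarithms are assumed here (the base only rescales the result), and the function names and example probabilities are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy, eq. (6); zero-probability states contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def conditional_entropy(p_x, p_y_given_x):
    """Conditional entropy H(Y|X), eq. (8).

    p_x[j] is the probability of X = s_j and p_y_given_x[j] is the
    distribution of Y given X = s_j.
    """
    return sum(p_j * entropy(row) for p_j, row in zip(p_x, p_y_given_x))


# Uniform distribution over four states reaches the Jensen bound log n.
UNIFORM = [0.25, 0.25, 0.25, 0.25]

# Hypothetical channel: the first input fixes the output, the second
# input leaves a fair coin flip of uncertainty.
P_X = [0.5, 0.5]
P_Y_GIVEN_X = [[1.0, 0.0], [0.5, 0.5]]
```

With these numbers the conditional entropy is half of log 2, since only the second input leaves any ambiguity.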



III. TOPOLOGICAL REVERSIBILITY AND 
ENTROPY 

Let us imagine that a node $v_i \in V \setminus M$ of a given
DAG $\mathcal{G}$ receives the visit of a random walker that
follows the flow chart depicted by the DAG. We only know
that it began its walk at a given maximal node and that it
followed a downstream random path, attending to the
directions of the arrows, to reach the node $v_i$. Suppose
also that the global structure of the graph is unknown. What
is the uncertainty associated to the followed path? In other
words, what is the amount of information we need, on
average, to successfully perform the backward process?

FIG. 2: Uncertainty in the reversal of causal flows in a DAG.
Notice that more than one pathway, each with more or less
probability of being chosen, connects the maximals to each
terminal (a). Given a node ($v_6$) receiving two inputs, we
consider two different alternatives to go backwards. The
uncertainty in this particular case is obtained by computing
$h_L(v_i)$ from eq. (14), i.e., $h_L(v_6) = \log 2$, assuming
equiprobability in the selection (b).

A. The definition of entropy

As we mentioned above, the starting point of our derivation
is close to the treatment of Bayesian networks [34]. In our
approach, the first task is to define the probability to
follow a given path $\pi_k \in \Pi(v_i)$ when reverting the
process. Let $v(\pi_k)$ be the set of nodes participating in
the path $\pi_k$ except the maximal ones. Maximal nodes are
not included in this set because they are the ends of the
path of the reversal process. The probability to choose such
a path from node $v_i$ by making a random decision at every
crossing when reverting the causal flow will be:

$$P(\pi_k|v_i) = \prod_{v_j \in v(\pi_k)} \frac{1}{k_{in}(v_j)}. \qquad (9)$$

Consistently:

$$\sum_{\pi_k \in \Pi(v_i)} \prod_{v_j \in v(\pi_k)} \frac{1}{k_{in}(v_j)} = 1.$$

As $P$ is a probability distribution, we can compute the
uncertainty associated to a reversal of the causal flow,
starting the reversion process from a given node
$v_i \in V \setminus M$, to be written as $h(v_i)$:

$$h(v_i) = -\sum_{\pi_k \in \Pi(v_i)} P(\pi_k|v_i) \log P(\pi_k|v_i). \qquad (10)$$

The overall uncertainty of $\mathcal{G}$, written as
$H(\mathcal{G})$, is computed by averaging $h$ over all
non-maximal nodes, i.e.:

$$H(\mathcal{G}) = -\sum_{v_i \in V \setminus M} p(v_i) \sum_{\pi_k \in \Pi(v_i)} P(\pi_k|v_i) \log P(\pi_k|v_i) = \sum_{v_i \in V \setminus M} p(v_i) h(v_i). \qquad (11)$$
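The path probabilities and the entropy of the reversal process can be computed by brute-force path enumeration on a small graph. The sketch below is illustrative only: the predecessor-list representation `PREDS`, the function names, the uniform weighting of non-maximal nodes and the use of natural logarithms are all assumptions made for the example.

```python
import math


def backward_paths(preds, node):
    """Yield (path, probability) for each backward path from node to a maximal node.

    At every node one incoming link is picked uniformly at random, so the
    probability of a path is the product of 1/k_in over its non-maximal nodes.
    """
    parents = preds[node]
    if not parents:                 # maximal node: the reversal stops here
        yield [node], 1.0
        return
    step = 1.0 / len(parents)
    for parent in parents:
        for tail, prob in backward_paths(preds, parent):
            yield [node] + tail, step * prob


def node_entropy(preds, node):
    """Uncertainty h(v) of reverting the flow from a single node."""
    return -sum(p * math.log(p) for _, p in backward_paths(preds, node))


def graph_entropy(preds):
    """Average of h(v) over non-maximal nodes, assuming p(v) = 1/(n - m)."""
    non_maximal = [v for v, parents in preds.items() if parents]
    return sum(node_entropy(preds, v) for v in non_maximal) / len(non_maximal)


# Hypothetical DAG with edges 0->1, 0->2, 1->2, 1->3, 2->3,
# stored as node -> list of parents.
PREDS = {0: [], 1: [0], 2: [0, 1], 3: [1, 2]}
```

From node 3 there are three backward paths, with probabilities 1/2, 1/4 and 1/4, which gives an uncertainty of 1.5 log 2 for that node.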



B. The transition matrix $\Phi$ and its relation to the
adjacency matrix

The main combinatorial object of our approach is not the
adjacency matrix but instead a mathematical representation
of the probability to visit a node $v_i \in V \setminus M$
starting the backward flow from a given, different node
$v_k \in V \setminus M$, regardless of the distance
separating them. As we shall see, this combinatorial
information can be encoded in a matrix, to be named
transition matrix $\Phi$, and we can explicitly obtain it
from $A(\mathcal{G})$. We begin by defining

$$v(\Pi(v_i)) = \bigcup_{\pi_k \in \Pi(v_i)} v(\pi_k), \qquad (12)$$

and we can see that:

$$\begin{aligned}
h(v_i) &= -\sum_{\pi_k \in \Pi(v_i)} P(\pi_k|v_i) \log P(\pi_k|v_i) \\
&= \sum_{\pi_k \in \Pi(v_i)} P(\pi_k|v_i) \sum_{v_j \in v(\pi_k)} \log(k_{in}(v_j)) \\
&= \sum_{v_j \in v(\Pi(v_i))} \log(k_{in}(v_j)) \sum_{\pi_k : v_j \in v(\pi_k)} P(\pi_k|v_i) \\
&= \sum_{v_k \in V \setminus M} \phi_{ik}(\mathcal{G})\, h_L(v_k). \qquad (13)
\end{aligned}$$

Let us explain eq. (13) and its consequences. First we
define $h_L(v_i)$ as:

$$h_L(v_i) = \log(k_{in}(v_i)), \qquad (14)$$

where $L$ indicates the amount of local entropy introduced
in a given node when performing the reversion process (see
fig. 2b). Thereby, it is the amount of information needed to
properly revert the flow backwards when a bifurcation point
is reached having $k_{in}$ possible choices. Secondly, we
define $\phi_{ij}$ as the coefficients of an
$(n - m) \times (n - m)$ matrix
$\Phi(\mathcal{G}) = [\phi_{ij}(\mathcal{G})]$, i.e. our
transition matrix $\Phi$:

$$\phi_{ij}(\mathcal{G}) = \sum_{\pi_k \in \Pi(v_i) : v_j \in v(\pi_k)} P(\pi_k|v_i).$$



This represents the probability to reach $v_j$ starting from
$v_i$. Now we derive the general expression for $\Phi$. The
derivation allows us to obtain a consistent mathematical
definition of the transition matrix in terms of
$A(\mathcal{G})$. We first notice two important facts
linking paths and the powers of the adjacency matrix that
are only generically valid in DAG-like networks. First, we
observe that:

$$|\Pi(v_i)| = \sum_{j \le L(\mathcal{G})} \sum_{l : v_l \in M} \left(A^T(\mathcal{G})\right)^j_{il}, \qquad (15)$$

$L(\mathcal{G})$ being the length of the longest path of the
graph as defined by eq. (2). Analogously, the number of
paths of $\Pi(v_i)$ crossing $v_k$, to be written as
$\alpha_{ik}$, is:

$$\alpha_{ik} = |\{\pi_j \in \Pi(v_i) : v_k \in v(\pi_j)\}| = \sum_{j \le L(\mathcal{G})} \left(A^T(\mathcal{G})\right)^j_{ik}. \qquad (16)$$



The above quantities provide the number of paths. To compute
the probability to reach a given node, we have to take into
account the probability to follow a given path containing
such a node, defined in eq. (9). To rigorously connect it to
the adjacency matrix, we first define an auxiliary
$(n - m) \times (n - m)$ matrix $B(\mathcal{G})$, namely:

$$B(\mathcal{G})_{ij} = A_{ij}(\mathcal{G}) \left( \sum_{l \le n} A_{lj}(\mathcal{G}) \right)^{-1}, \qquad (17)$$

where $v_i, v_j \in V \setminus M$. From this definition, we
obtain the explicit dependency of $\Phi$ on the adjacency
matrix, namely [40],

$$\Phi(\mathcal{G}) = \sum_{k \le L(\mathcal{G})} \left(B^T(\mathcal{G})\right)^k, \qquad (18)$$

and accordingly, we have

$$\phi_{ii}(\mathcal{G}) = \left(B^T(\mathcal{G})\right)^0_{ii} = 1. \qquad (19)$$



It is worth mentioning that $\Phi(\mathcal{G})$ resembles
the transition matrix related to the concept of information
mobility [20]. In the general case of non-directed graphs,
one can assume the presence of paths of arbitrary length,
which leads (using a correction factor tied to the length of
the path) to an asymptotic form of the transition matrix in
terms of the exponential of the adjacency matrix. However,
the intrinsically finite nature of the paths in a given DAG
makes the above asymptotic treatment non-viable.
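The construction of the transition matrix from the adjacency matrix can be sketched in a few lines of Python. The version below is an illustration under stated assumptions: it takes B as the adjacency matrix with each column divided by the in-degree of its node, restricted to non-maximal nodes, and sums the powers of its transpose from the zeroth power (the identity) up to a user-supplied longest path length. Exact fractions are used to avoid rounding; the function name and example graph are hypothetical.

```python
from fractions import Fraction


def transition_matrix(adj, longest):
    """Return (nodes, phi), where phi[a][b] is the probability of visiting
    nodes[b] when the flow is reverted starting from nodes[a]."""
    n = len(adj)
    k_in = [sum(adj[l][j] for l in range(n)) for j in range(n)]
    nodes = [j for j in range(n) if k_in[j] > 0]      # non-maximal nodes
    size = len(nodes)
    # bt[a][b] = B_{ba} = A_{ba} / k_in(a): one backward step from a to b.
    bt = [[Fraction(adj[nodes[b]][nodes[a]], k_in[nodes[a]])
           for b in range(size)] for a in range(size)]
    power = [[Fraction(int(a == b)) for b in range(size)] for a in range(size)]
    phi = [row[:] for row in power]                   # zeroth power: identity
    for _ in range(longest):
        power = [[sum(power[a][c] * bt[c][b] for c in range(size))
                  for b in range(size)] for a in range(size)]
        phi = [[phi[a][b] + power[a][b] for b in range(size)]
               for a in range(size)]
    return nodes, phi


# Hypothetical DAG with edges 0->1, 0->2, 1->2, 1->3, 2->3 (longest path: 3 links).
ADJ = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
NODES, PHI = transition_matrix(ADJ, 3)
```

In this example the walker reverting the flow from node 3 visits node 1 with probability 3/4 (directly with probability 1/2, or through node 2 with probability 1/4), which is the entry `PHI[2][0]`. Passing a value of `longest` larger than the true longest path is harmless, because the higher powers vanish in a DAG.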



C. The general form of the Entropy 

Let us now define the overall entropy in a compact form,
only depending on the adjacency matrix of the graph. From
eqs. (8), (11) and (13), we obtain

$$H(\mathcal{G}) = \sum_{v_i \in V \setminus M} p(v_i) \sum_{v_k \in V \setminus M} \phi_{ik}(\mathcal{G})\, h_L(v_k). \qquad (20)$$



This is the central equation of this paper. This measure
quantifies the additional information (other than the
topological one) needed to properly revert the causal flow.
We observe that this expression is a noise term within
standard information theory [28]. In this equation we have
been able to decouple the combinatorial term associated to
the multiplicity of paths, on the one hand, and the
particular contribution of every node to the overall
uncertainty, on the other. The former is captured by the
matrix $\Phi$, which encodes the combinatorial properties of
the system and how they influence the computation of the
entropies. The latter is obtained from the set of local
entropies $h_L(v_1), ..., h_L(v_{n-m})$. These terms account
for the contribution of local topology (i.e. the uncertainty
when choosing an incoming link at the node level in the
reversion of the causal flow) to the overall entropy. This
uncoupling is a consequence of the extensive property of the
entropy and, putting aside its conceptual interest, it
simplifies all derivations related to the uncertainties,
since we are not forced to compute the complex series
arising in the brute-force calculation of entropies. This
general expression of the entropy can be simplified if we
assume that $\forall v_i \in V \setminus M$,
$p(v_i) = 1/(n - m)$. Therefore, by defining

$$Q(\mathcal{G}) = \sum_{v_i \in V \setminus M} \sum_{v_k \in V \setminus M} \phi_{ik}(\mathcal{G})\, h_L(v_k), \qquad (21)$$

$H(\mathcal{G})$ is expressed as:

$$H(\mathcal{G}) = \frac{1}{n - m} Q(\mathcal{G}). \qquad (22)$$



Finally, we recall that the above entropy is bounded by
Jensen's inequality (7), i.e.,

$$H(\mathcal{G}) \le \frac{1}{n - m} \sum_{v_i \in V \setminus M} \log(|\Pi(v_i)|). \qquad (23)$$

Notice that the quantity on the right side of eq. (23) is
the uncertainty obtained by considering all paths from $M$
to $v_i$ equally likely to occur.
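The entropy and its Jensen bound can be compared numerically. The following Python sketch is illustrative: instead of the matrix sum, it obtains the visit probabilities and the path counts through an equivalent recursion over parents, and it assumes uniform weights p(v) = 1/(n - m) and natural logarithms. The function name and the three example graphs (given as node -> tuple of parents) are hypothetical.

```python
import math
from functools import lru_cache


def graph_entropy_and_bound(preds):
    """Return (H, bound): the entropy under uniform node weights and the
    upper bound obtained by treating all backward paths as equally likely."""
    non_max = [v for v in preds if preds[v]]

    @lru_cache(maxsize=None)
    def phi(i, k):
        # Probability of visiting k when reverting the flow from i.
        if i == k:
            return 1.0
        if not preds[i]:
            return 0.0
        return sum(phi(p, k) for p in preds[i]) / len(preds[i])

    @lru_cache(maxsize=None)
    def n_paths(v):
        # Number of paths from the maximal set down to v.
        if not preds[v]:
            return 1
        return sum(n_paths(p) for p in preds[v])

    h_local = {v: math.log(len(preds[v])) for v in non_max}
    q = sum(phi(i, k) * h_local[k] for i in non_max for k in non_max)
    entropy = q / len(non_max)
    bound = sum(math.log(n_paths(v)) for v in non_max) / len(non_max)
    return entropy, bound


# Hypothetical examples.
DAG = {0: (), 1: (0,), 2: (0, 1), 3: (1, 2)}
TREE = {0: (), 1: (0,), 2: (0,), 3: (1,)}
STAR = {0: (), 1: (), 2: (), 3: (), 4: (), 5: (), 6: (0, 1, 2, 3, 4, 5)}
```

The tree gives zero entropy, the star with n = 7 gives log 6 and saturates the bound, and the generic DAG falls strictly below its bound because its backward paths are not equiprobable.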



D. Topological reversibility 

Having defined an appropriate and well grounded entropy
measure, we can now discuss the meaning of topological
(ir)reversibility. Let us first make a qualitative link with
the standard theory of irreversible thermodynamics, where
irreversibility is tied to the parameter of entropy
production $\sigma_s$ in the entropy balance equation [15].
Here, $\sigma_s = 0$ depicts thermodynamically reversible
processes, whereas $\sigma_s > 0$ appears in irreversible
processes [14], [15]. Irreversibility is rooted in the
impossibility of reverting the process without generating a
negative amount of entropy, which contradicts the second law
of thermodynamics. Consistently, we will call topologically
reversible those DAG structures such that

$$H(\mathcal{G}) = 0.$$

In those structures (they belong to the set of trees, as we
shall see in the following section) no ambiguity arises when
performing the reversion process. On the contrary, a given
DAG for which

$$H(\mathcal{G}) > 0$$

will be referred to as topologically irreversible. DAGs
having $H(\mathcal{G}) > 0$ display some degree of
uncertainty when taking the causal flow backwards, since the
reversion process is subject to some inevitable random
decisions. In these cases, $H(\mathcal{G})$ is the average
amount of extra information needed to successfully perform
the process backwards. Similarly, the successful reversion
of a thermodynamically irreversible process would imply the
(irreversible) addition of external energy, and the
reversion of a logically irreversible computation requires
an extra amount of external information to resolve the
ambiguity arising in rewinding the chain of computations. In
this context, for example, reversible computation is defined
by considering a system that stores the history of the
computational process [10]. Furthermore, we observe that,
roughly speaking, we can associate the logical
(ir)reversibility of a computational process to the
topological (ir)reversibility of its DAG representation. In
our study, the adjective topological arises from the fact
that we only use topological information to compute the
uncertainty. Thus, we deliberately neglect the active role
that a given node can play as, for example, a processing
unit, or the different weights of the paths. However, it is
worth mentioning that the entropy can be generalized to DAGs
where links are weighted by a probability to be chosen in
the process of reaching the maximal.





FIG. 3: A topologically reversible structure given by a tree
DAG, $H(\mathcal{G}) = 0$ (a). A topologically irreversible
DAG given by a star DAG with $m = n - 1$. Notice that
for a star graph $H(\mathcal{G}) = \log(n - 1)$, where $n = 7$ in this
particular case (b).



IV. LIMIT CASES: MAXIMUM AND
MINIMUM UNCERTAINTY

Let us illustrate our previous results by exploring two
limit cases, namely DAGs having zero or maximal uncertainty.
In this section we identify those feed-forward structures
which, containing $n$ nodes and without a predefined number
of links, minimize or maximize the above uncertainties. In
this way, for example, a chain having $m = 1$ will display
$H(\mathcal{G}) = 0$, whereas its somehow opposite graph,
the star having $m = n - 1$, will have
$H(\mathcal{G}) = \log(n - 1)$. The derivation of the limit
scenarios will be more sophisticated, due to the active role
of combinatorics in defining the paths. The minimum
uncertainties are obtained when the graph $\mathcal{G}$ is a
special kind of tree, to be described below. Afterwards, we
also derive the graph configuration with maximum entropy.
The conceptual starting point of this derivation is the
graph representation of the linear order.

A. Zero Uncertainty: Trees

Imagine a random walker exploring a (directed) tree
containing only a single maximal (fig. 3a). From such a
maximal node, there exists only one path to a given node. In
the evolutionary context, a single ancestor is at the root
of every evolutionary tree [35]. Thus, the process of
recovering the history of the random walker up to its
initial condition is completely deterministic, and no
uncertainty can be associated to it, in purely topological
terms. Formally, we recognize two defining features of
trees, namely:

• $m = 1$,

• $(\forall v_i \in V \setminus M)(k_{in}(v_i) = 1)$.

We thus conclude that there is no uncertainty in recovering
the flow, since the two reported properties are enough to
conclude that there is one and only one path to go from $M$
to any $v_i \in V \setminus M$. This agrees with the
intuitive idea that trees are perfect hierarchical
structures.

This result complements the more standard scenario of the
forward, downstream paths followed by a random walker on a
tree [16]. It is worth noting that evolutionary trees,
particularly in unicellular organisms, have been found to be
a poor representation of the actual evolutionary process
[36], [37].



B. Maximum Uncertainty 

Now we consider the maximally entropic scenario. For this
purpose, we split the problem into two parts: First, we
constructively obtain the feed-forward graph containing $m$
maximal nodes that maximizes $H(\mathcal{G})$. Once we have
identified such a feed-forward configuration, we ask for the
$m$ that maximizes such a quantity.



1. The linear ordering in V\M. 

Let $\mathcal{G}$ be a feed-forward organized graph
containing $n$ nodes, where $m$ of them are maximal. Since
for the entropy computation all nodes become
indistinguishable, let $g(m,n)$ be the ensemble of different
possible feed-forward configurations containing $n$ nodes,
where $m$ of them are maximal. We are looking for a graph,
to be written as $\mathcal{G} \in g(m,n)$, such that
$\forall \mathcal{G}_i \in g(m,n)$:

$$\mathcal{G}_i \subseteq \mathcal{G}, \qquad (24)$$

i.e., a graph containing all possible links, preserving the
number of maximal nodes. This implies, as defined in section
II A, eq. (5), that we must add links to the set
$V \setminus M$ until it becomes linearly ordered, attending
to a labeling of nodes which respects the ordering depicted
by the feed-forward graph (see fig. 1c). Once we have the
set of nodes $V \setminus M$ linearly ordered, we proceed to
generate a link from any node $v_i \in M$ to any node
$v_k \in V \setminus M$. We thus obtain a feed-forward graph
containing $m$ maximal nodes and only one minimal node. In
the graph constructed above, any new link creates a cycle or
destroys a maximal vertex. Furthermore, given two fixed
values of $m$ and $n$, it is straightforward to demonstrate
that it maximizes any entropy based on paths: Any
feed-forward graph of the ensemble $g(m,n)$ other than
$\mathcal{G}$ is obtained by removing edges of
$\mathcal{G}$. This edge removal process will necessarily
result in a reduction of uncertainty.

For the sake of clarity we differentiate the labeling of $M$
and $V \setminus M$ when working with $\mathcal{G}$.
Specifically, nodes $v_i \in V \setminus M$ will be labeled
sequentially from $1$ to $n - m$ respecting the ordering
defined in eq. (4). This labeling will be widely used in the
forthcoming sections. Furthermore, we recall that no special
labeling other than different natural numbers is needed for
$v_k \in M$, since there will be no ambiguous situations.
Given the labeling proposed above, and starting from eq.
(15), the number of paths in $\mathcal{G}$ from $M$ to
$v_i \in V \setminus M$ will be:

$$|\Pi(v_i)| = \sum_{j \le L(\mathcal{G})} \sum_{l : v_l \in M} \left(A^T(\mathcal{G})\right)^j_{il} = \sum_{l : v_l \in M} \sum_{j \le i} \binom{i-1}{j-1} = m \sum_{j \le i} \binom{i-1}{j-1} = m \cdot 2^{i-1}. \qquad (25)$$

2. The explicit form of entropies in the linear ordering of
V\M.

We first bound $H(\mathcal{G})$ using Jensen's inequality.
Indeed, from eqs. (23) and (25) we can derive an upper bound
for $H(\mathcal{G})$, namely

$$H(\mathcal{G}) \le \log m + \frac{\log 2}{2}(n - m - 1). \qquad (26)$$

We can go further, first computing the probabilities
defining the matrix $\Phi(\mathcal{G})$. To compute these
probabilities, let us suppose we are in node
$v_i \in V \setminus M$. The first observation is that the
probability to reach one given maximal is $1/m$. What about
$v_1$, i.e., the first node we find after the maximal set?
We observe that, from the node $v_i$, the situation is
completely analogous to the situation where there are
$m + 1$ maximal nodes, since the probability to pass through
$v_1$ does not depend on what happens above $v_1$.
Therefore:

$$\phi_{i1} = \frac{1}{m + 1},$$

and running the reasoning from $v_1$ to $v_{i-1}$, we find
that:

$$\phi_{ik} = \frac{1}{m + k} \qquad (k < i).$$

Interestingly, for $k < i$, $\phi_{ik}$ is invariant, no
matter the value of $i$. This leads matrix
$\Phi(\mathcal{G})$ to be:

$$\Phi(\mathcal{G}) = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
\frac{1}{m+1} & 1 & 0 & \cdots & 0 \\
\frac{1}{m+1} & \frac{1}{m+2} & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{1}{m+1} & \frac{1}{m+2} & \frac{1}{m+3} & \cdots & 1
\end{pmatrix}, \qquad (27)$$

and the final expression is obtained by observing that

$$h_L(v_k) = \log(m + k - 1),$$

and therefore, inserting it and (27) into eq. (22), we
obtain after some algebra:

$$H(\mathcal{G}) = \frac{n}{n - m} \sum_{i \le n - m} f(v_i), \qquad (28)$$

where $f(v_i)$ is a function $f : V \setminus M \to \mathbb{R}^+$,

$$f(v_i) = \frac{\log(m + i - 1)}{m + i}. \qquad (29)$$

We can see that the value of the entropy is reduced to the
computation of the average of $f$ over the set
$V \setminus M$. If $\mathcal{G}$ contains $n$ nodes, $m$ of
them being the maximal ones, we will refer to this average
as $\langle f(n,m) \rangle$, defined as:

$$\langle f(n,m) \rangle = \frac{1}{n - m} \sum_{i \le n - m} f(v_i). \qquad (30)$$
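The structure of the transition matrix in the linearly ordered graph can be checked by building the graph explicitly. The Python sketch below is an illustration under stated assumptions: maximal nodes are labelled by hypothetical tuples `('M', a)`, non-maximal nodes by the integers 1 to n - m, the node weights are uniform, and the closed form tested is H = n/(n - m) times the sum over i of log(m + i - 1)/(m + i), which follows from inserting the visit probabilities 1/(m + k) and the local entropies log(m + k - 1) into the uniform-weight entropy.

```python
import math
from fractions import Fraction


def max_entropy_graph(n, m):
    """Parents of each non-maximal node i = 1..n-m: all m maximal nodes
    plus every non-maximal node with a smaller label."""
    maximals = [('M', a) for a in range(m)]
    return {i: maximals + list(range(1, i)) for i in range(1, n - m + 1)}


def visit_probability(preds, i, k):
    """Probability of visiting node k when reverting the flow from node i."""
    if i == k:
        return Fraction(1)
    if i not in preds:              # maximal node: nothing further upstream
        return Fraction(0)
    total = sum((visit_probability(preds, p, k) for p in preds[i]), Fraction(0))
    return total / len(preds[i])


def entropy_from_graph(n, m):
    """Uniformly weighted entropy computed from the explicit graph."""
    preds = max_entropy_graph(n, m)
    q = sum(float(visit_probability(preds, i, k)) * math.log(len(preds[k]))
            for i in preds for k in preds)
    return q / (n - m)


def entropy_closed_form(n, m):
    """Closed form assumed in the lead-in above."""
    return n / (n - m) * sum(math.log(m + i - 1) / (m + i)
                             for i in range(1, n - m + 1))


PREDS = max_entropy_graph(7, 2)     # hypothetical size: n = 7, m = 2
```

For n = 7 and m = 2 the probability of visiting node 2 is 1/4 whether the reversal starts at node 3, 4 or 5, which illustrates the invariance with respect to the starting node.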



3. Absolute maxima of entropies 

What is the relation between $n$ and $m$ maximizing the
above entropies? As we shall see, given a fixed value of
$n$, the absolute maximum is found in the linear ordering
defined above at $m^* = 2$, for graph sizes $n \gg 1$. To
support the above claim, let us first notice that:

$$Q(\mathcal{G})\big|_{m=1} = Q(\mathcal{G})\big|_{m=2},$$

enabling us to derive the first inequality:

$$H(\mathcal{G})\big|_{m=1} = \frac{1}{n-1} Q(\mathcal{G})\big|_{m=1} < \frac{1}{n-2} Q(\mathcal{G})\big|_{m=2} = H(\mathcal{G})\big|_{m=2}. \qquad (31)$$

Once we have demonstrated that
$H(\mathcal{G})|_{m=2} > H(\mathcal{G})|_{m=1}$, we proceed
to demonstrate that
$H(\mathcal{G})|_{m=2} > H(\mathcal{G})|_{m=3}$. To this
end, let us first observe a key property of $f$, defined in
eq. (29). Indeed, we observe that
$(\forall \epsilon > 0)(\exists k_\epsilon) : (\forall k > k_\epsilon)$,

$$f(v_k) < \epsilon, \qquad (32)$$

provided that $n$ is large enough. From this property, and
since $\langle f(n,m) \rangle$ is an average (see eq. (30)),
we can be sure that $(\exists n^*) : (\forall n > n^*)$,

$$\frac{\log 2}{3} > \langle f(n,3) \rangle, \qquad (33)$$

by choosing $n$ appropriately, in such a way that we have
enough terms lower than a given $\epsilon$ to obtain the
desired result. Thus, from eq. (30) and knowing that

$$H(\mathcal{G})\big|_{m=2} - H(\mathcal{G})\big|_{m=3} \propto \frac{\log 2}{3} - \langle f(n,3) \rangle,$$

(with proportionality factor equal to $n/(n - 2)$) we can
conclude that

$$H(\mathcal{G})\big|_{m=2} > H(\mathcal{G})\big|_{m=3}.$$

The general case easily derives from the same reasoning,
since:

$$H(\mathcal{G})\big|_{m=k} - H(\mathcal{G})\big|_{m=k+1} \propto \frac{\log k}{k+1} - \langle f(n,k+1) \rangle,$$

and thus, we can conclude that:

$$(\forall k \ge 2) \quad H(\mathcal{G})\big|_{m=k} > H(\mathcal{G})\big|_{m=k+1}. \qquad (34)$$

This closes the demonstration that $\mathcal{G}$ containing
$m = 2$ is the most entropic graph provided that $n > 14$,
according to numerical computations.
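The dependence of the maximum entropy on the number of maximal nodes can be explored numerically. The sketch below is a hedged illustration, not the authors' computation: it assumes the closed form H = n/(n - m) times the sum over i of log(m + i - 1)/(m + i) for the linearly ordered graph with uniform node weights, and the function names are hypothetical.

```python
import math


def max_entropy(n, m):
    """Assumed closed form of the entropy of the linearly ordered graph
    with n nodes, m of them maximal."""
    return n / (n - m) * sum(math.log(m + i - 1) / (m + i)
                             for i in range(1, n - m + 1))


def best_m(n):
    """Number of maximal nodes (1 <= m <= n-1) giving the largest entropy."""
    return max(range(1, n), key=lambda m: max_entropy(n, m))


# m = n - 1 recovers the star graph, whose entropy is log(n - 1).
STAR_CHECK = abs(max_entropy(7, 6) - math.log(6)) < 1e-12
```

Under these assumptions `best_m(15)` returns 2, in line with the n > 14 threshold quoted above, while for n = 14 the configuration with three maximal nodes is still marginally more entropic.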



causal processes is topologically reversible if we can re- 
cover all causal paths with no other information than the 
one provided by the graph topology. If graph topology 
induces some kind of ambiguity in the backward pro- 
cess, the graph is said to be topologically irreversible, 
and additional information is needed to build the back- 
ward flows. 

We provided the analytical form of the uncertainty (the
amount of extra information needed) arising in the reversion
process by uncoupling the combinatorial information encoded
by the graph structure from the contributions of the local
connectivity patterns of individual nodes, as depicted in
eqs. (21) and (22). It is worth noting that all our results
are derived from just two basic concepts: the adjacency
matrix of the graph and the definition of entropy.
Furthermore, we offer a constructive derivation of the two
limit cases, namely trees (as the reversible ones) and
linearly ordered graphs having two maximal nodes (as the
most uncertain ones).

According to our results, only a tree DAG is topologically
reversible. However, beyond this singular case, the
quantification of topological irreversibility by using the
entropy proposed here could provide insights into the
characterization of feed-forward systems. An illustrative
case study can be found precisely in biological evolution.
The standard view of the tree of life involves a
directional, upward time-arrow where the genetic structure
of a given species (its genome) derives from some ancestor
after splitting (speciation) events. One would think that
this classical but too simplistic view of evolution as a
tree gives a topologically reversible lineage of genes,
changing by mutations and passing from the initial ancestor
to current species by vertical inheritance. However, it has
recently been shown that the so-called horizontal gene
transfer among unrelated species may have had a deep impact
on the evolution and diversification of microbes [37].
According to this genetic mechanism, the tree-like
structure, and thus the logical/topological reversibility,
is broken by the presence of cross-links between brother
species. In light of this evidence, tree-based phylogenies
become unrealistic. In this context, our theoretical
approach provides a suitable framework for the
characterization of the logical irreversibility of
biological evolution and, in general, of any process where
time or energy dissipation imposes a feed-forward chart of
events. Further research on this topic will help to
understand the causal structure of evolutionary processes.